Recognizing text in raster maps
Abstract
Text labels in maps provide valuable geographic information by associating place names with locations. This information is especially important in historical maps, which are very often the only source of information about the earth's past. Recognizing text labels is challenging because heterogeneous raster maps vary in image quality and have complex contents. In addition, the labels within a map do not follow a fixed orientation and can appear in various font types and sizes. Previous approaches typically handle a specific type of map or require intensive manual work. This paper presents a general approach that requires only a small amount of user effort to semi-automatically recognize text labels in heterogeneous raster maps. Our approach exploits a few examples of text areas to extract text pixels and employs cartographic labeling principles to locate individual text labels. Each text label is then automatically rotated to horizontal and processed by conventional OCR software for character recognition. We compared our approach to a state-of-the-art commercial OCR product using 15 raster maps from 10 sources. Our evaluation shows that our approach enabled the commercial OCR product to handle raster maps, and that the two together produced significantly higher text recognition accuracy than the commercial OCR product alone.
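The step of rotating a label to horizontal can be illustrated with a small sketch. The abstract does not say how the orientation of a label is determined, so the code below assumes, purely for illustration, that the orientation is estimated from the principal axis of the label's text pixels. The function names `principal_angle` and `rotate_to_horizontal` are hypothetical and are not taken from the paper.

```python
import math


def principal_angle(pixels):
    """Estimate the orientation of a pixel set as the angle (radians) of its
    principal axis, in the range (-pi/2, pi/2].

    Assumption for illustration: a text label is elongated along its
    reading direction, so its principal axis approximates that direction.
    """
    n = len(pixels)
    mean_x = sum(x for x, _ in pixels) / n
    mean_y = sum(y for _, y in pixels) / n
    # Entries of the 2x2 scatter matrix of the centered coordinates.
    sxx = sum((x - mean_x) ** 2 for x, _ in pixels)
    syy = sum((y - mean_y) ** 2 for _, y in pixels)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pixels)
    # Closed-form angle of the dominant eigenvector of the scatter matrix.
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)


def rotate_to_horizontal(pixels):
    """Rotate the pixel coordinates about their centroid so that the
    principal axis becomes horizontal. Returns float coordinates."""
    angle = principal_angle(pixels)
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)
    rotated = []
    for x, y in pixels:
        dx, dy = x - cx, y - cy
        rotated.append((dx * cos_a - dy * sin_a + cx,
                        dx * sin_a + dy * cos_a + cy))
    return rotated


if __name__ == "__main__":
    # Synthetic "label": points along a line tilted by 30 degrees.
    tilt = math.radians(30)
    label = [(t * math.cos(tilt), t * math.sin(tilt)) for t in range(20)]
    print(round(math.degrees(principal_angle(label)), 3))
    flat = rotate_to_horizontal(label)
    print(max(y for _, y in flat) - min(y for _, y in flat))
```

A principal axis has no direction, so this sketch cannot distinguish a label from the same label turned upside down. A pipeline built on it would need an additional check, for example running OCR on both candidate orientations.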
Related resources
Automatic Text Recognition from Raster Maps
Text labels in raster maps provide valuable geospatial information by associating geospatial locations with geographical names. Although present commercial optical character recognition (OCR) products can achieve a high recognition rate on documents, text recognition on raster maps is still challenging due to the varying text orientations and the overlapping between text labels. This paper pres...
Generating Named Road Vector Data from Raster Maps
Raster maps contain rich road information, such as the topology and names of roads, but this information is “locked” in images and inaccessible in a geographic information system (GIS). Previous approaches for road extraction from raster maps typically handle this problem as raster-to-vector conversion and hence the extracted road vector data are line segments without the knowledge of road name...
Extracting Road Vector Data from Raster Maps
Raster maps are an important source of road information. Because of the overlapping map features (e.g., roads and text labels) and the varying image quality, extracting road vector data from raster maps usually requires significant user input to achieve accurate results. In this paper, we present an accurate road vectorization technique that minimizes user input by combining our previous work o...
Error Detection and Correction in Toponym Recognition in Cartographic Maps
At present, many methods and programs for automatic text recognition exist. However, there are no effective text recognition systems for graphic documents. Graphic documents usually contain a great variety of textual information. As a rule, the text appears in arbitrary spatial positions, in different fonts, sizes and colors. The text can touch and overlap graphic symbols. The text meaning is ...
Resolving Ambiguities in Toponym Recognition in Cartographic Maps
To date, many methods and programs for automatic text recognition exist. However, there are no effective text recognition systems for graphic documents. Graphic documents usually contain a great variety of textual information. As a rule, the text appears in arbitrary spatial positions, in different fonts, sizes and colors. The text can touch and overlap graphic symbols. The text meaning is semanti...
Journal title:
- GeoInformatica
Volume 19, Issue
Pages -
Publication date: 2015